microRNA sequencing for biomarker detection in the diagnosis, classification and prognosis of Diffuse Large B Cell Lymphoma

Despite being considered a single disease, Diffuse Large B Cell Lymphoma (DLBCL) presents with variable backgrounds, which results in heterogeneous outcomes among patients, with 40% of them still having primary refractory disease or relapse. Thus, novel biomarkers are needed. In addition, multiple factors regarding its pathogenesis remain unclear. In this context, recent investigations point to the relevance of microRNAs (miRNAs) in cancer. However, regarding DLBCL, there is inconsistency in the data reported. Therefore, in this work, the main goals were to determine a miRNA set with utility as biomarkers for DLBCL diagnosis, classification, prognosis and treatment response, as well as to decipher the mechanism of action of deregulated miRNAs in the origin of the disease. We analyzed miRNA expression in a cohort of 78 DLBCL patients and 17 controls using small RNA sequencing and performed a miRNA-mRNA interaction network analysis. This way, we were able to define new miRNA expression signatures for diagnosis, classification, treatment response and prognosis, and we identified plausible mechanisms of action by which deregulated miRNAs could be involved in DLBCL pathogenesis. In summary, our study remarks that miRNAs could play an important role in DLBCL.

www.nature.com/scientificreports/ set of selected miRNAs, which limits the number of comparable results and centers the discussion on those miRNAs that are better known, leaving other miRNAs aside. Therefore, we considered that it was necessary to perform large-scale studies with a wider array of miRNAs using techniques such as next-generation sequencing that allow a deeper and unbiased identification of miRNAs with the potential to be used as biomarkers in DLBCL. Moreover, this technology allows the characterization of the whole miRNAome, providing an insight in the global mechanism of action of deregulated miRNAs. Consequently, the main goals of the present study were to determine a miRNA set with utility in DLBCL diagnosis, classification, prognosis and treatment response, through the analysis of all the expressed miRNAs in DLBCL using small RNA sequencing, as well as to decipher the mechanism of action of deregulated miRNAs in the origin of the disease through the development of a miRNA-mRNA interaction network.

Results
MiRNAs deregulated in DLBCL. A total of 1584 miRNAs were identified through the miRNA sequencing analysis. A principal component analysis (PCA) was performed including the expression of all the miRNAs to obtain an overview of the grouping of samples according to their variance. Patient and control samples cluster as clearly separate groups ( Supplementary Fig. S1). To obtain a comprehensive list of miRNAs deregulated in DLBCL, we compared the expression of each miRNA in 78 DLBCL samples at diagnosis with 17 non-tumoral ganglia from individuals without DLBCL. We noted that 146 miRNAs exhibited statistically significant differences in expression between samples of DLBCL patients at diagnosis and control samples. Of these miRNAs, 122 exhibited increased abundance in DLBCL, while 24 miRNAs exhibited decreased abundance. The 20 most discriminatory up-and down-regulated miRNAs are listed in Table 1.
MiRNAs associated with DLBCL treatment response. Eleven miRNAs were differentially expressed between samples at diagnosis from patients with long-term complete remission (n = 50) and refractory patients (n = 12), of which, 10 miRNAs were upregulated in patients with complete remission (miR-12136, miR-129-5p, miR-129-1-3p, miR-3150b-3p, miR-127-3p, miR-3681-5p, miR-370-3p, miR-4464, miR-129-5p and miR-3928-3p) and one miRNA was downregulated in the same group of patients (miR-192-5p) ( Table 3). www.nature.com/scientificreports/ MiRNAs as prognostic biomarkers in DLBCL and their impact on survival. MiRNA expression at diagnosis of 50 patients with long-term remission and 16 patients that relapsed within 10 years after diagnosis was compared. The results revealed that seven miRNAs (miR-4444, miR-449c-5p, miR-3681-5p, miR-3928-3p, miR-449b-5p, miR-370-3p, and miR-4424) were significantly upregulated in patients with long-term remission and three miRNAs (miR-133a-3p, miR-208b-3p, and miR-205-5p) were significantly upregulated in patients that relapsed (Table 4). Patient survival curves were generated to represent 5-year progression-free survival (PFS) and overall survival (OS), dividing the patients into high-expression and low-expression groups according to the mean expression of each of the ten miRNAs significantly associated with prognosis. The long-rank test revealed that, among the miRNAs associated with good prognosis, high expression of miR-370-3p was associated with better 5-year PFS (OS p-value = 0.17, and PFS p-value = 0.025) (Fig. 1). We also performed Cox PH multivariate analysis including two established indicators of DLBCL patient outcome (COO-Subtype and IPI) as covariates. However, the results of this analysis revealed that miR-370-3p was not associated with PFS independently of IPI and subtype. miRNA-mRNA interaction network analysis. For the 122 upregulated miRNAs found in our study (DLBCL vs controls), we identified a total of 482 experimentally validated targets using MirTarbase. Regarding the 24 downregulated miRNAs from our study, a total of 375 experimentally validated target genes were identified. Analysis of the GSE56315 dataset identified through the search in the GEO, revealed a total of 1918 significantly upregulated genes and 4545 downregulated genes in DLBCL patients compared with healthy control individuals.
Integration of both datasets showed that downregulated miRNAs targeted a total of 138 genes among those highly expressed in DLBCL (Supplementary Table S1), and upregulated miRNAs a total of 76 targets among those genes downregulated in DLBCL (Supplementary Table S2). Some miRNAs showed interactions with several targets (Supplementary Table S3). For instance, miR-9-5p was shown to target 25 genes downregulated in DLBCL, and miR-146a-5p and miR-182-5p were shown to target 24 and 20 genes downregulated in DLBCL, respectively.  www.nature.com/scientificreports/ Some target genes showed interactions with several miRNAs (Supplementary Table S4). For instance, FOXO1, BCL2, GS3KB and PTEN, downregulated in DLBCL, are targeted by 7, 5, 5 and 4 miRNAs deregulated in DLBCL, respectively. In order to evaluate the pathways that could be affected by the deregulated miRNAs through their target genes, we performed a pathway enrichment analysis using the ConsensusPathDB web tool. Among the most significantly over-represented pathways, we identified some involved in cancer, such as FoxO signaling pathway, signaling by Receptor Tyrosine Kinases, and PI3K-Akt signaling pathway (Fig. 2).

Discussion
In this study, we identified new signatures of miRNAs of relevance in DLBCL with potential to improve diagnosis, subtype characterization and treatment response through small RNA sequencing. To our knowledge, few reports exist in which miRNA sequencing were used to identify miRNA signatures in cancer, and only one which analyzed miRNAs with Next Generation Sequencing (NGS) in DLBCL 9 . Notably, deregulation of most of these miRNAs had not been reported previously in DLBCL.
Comparing miRNA expression in DLBCL GCB and Non-GCB subgroups, eight miRNAs were identified with subtype differentiation potential. Five of them were upregulated in GCB DLBCL patients (miR-129-2-3p,  www.nature.com/scientificreports/ miR-4464, miR-3150b-3p, miR-138-5p and miR-129-5p) and three were upregulated in Non-GCB subtype (miR-511-5p, miR-205-5p, and miR-3652). Of note, miR-28-5p, which was proposed as a biomarker of GCB DLBCL in our previous systematic review, pending further confirmation 16 , was also differentially expressed among subtypes, although the logFC did not reach the threshold set for the present study. Interestingly, higher levels of miR-129-5p and miR-138-5p in the GCB subtype had been previously reported 6,9,14,17 . As part of the miRNA-mRNA interaction network analysis, we observed that they could target transcripts that are known to be deregulated in the formation of germinal center lymphomas 18 , including genes related with the cell cycle (CDKN1A), MAPK and NFkB signaling (MAPK1). Interestingly, we also found TP53 as a target gene of miR-129-3p, miR-129-5p, and miR-138-5p. TP53 gene is a crucial tumor suppressor that mediates cell-cycle arrest, DNA repair, apoptosis, senescence, and autophagy 19,20 , and its dysfunction is implicated in lymphomagenesis and disease progression. Recent studies have shown that TP53 mutations have prognostic value in GCB-DLBCL, but not in Non-GCB-DLBCL 21 , which could suggest that GCB-DLBCL is TP53-dependent while Non-GCB DLBCL is TP53-independent, and thus, could support the role of miR-129-3p, miR-129-5p, and miR-138-5p in GBC subtype. We also identified a predictive miRNA signature for therapy response, which included the upregulation of ten miRNAs (miR-12136, miR-129a-5p, miR-129-1-3p, miR-3150b-3p, miR-127-3p, miR-3681-5p, miR-370-3p, miR-4464, miR-129b-5p and miR-3928-3p) in association with good response and upregulation of miR-192-5p associated with chemoresistance to R-CHOP DLBCL treatment. Notably, deregulation profiling of most of these miRNAs in DLBCL had not been previously reported, with a limited number of studies carried out; but some of them, such as miR-192 had been studied in the context of other types of cancer 22,23 , supporting their role in drug response. Interestingly, miR-370-3p has been recently shown as a target of circ_0000877 and LINC00857 mediating their effect on DLBCL cells survival, which would be in agreement with its association with good response to therapy 24,25 . In addition to being putative biomarkers for R-CHOP treatment response prediction, these miRNAs could also be considered in the future as novel therapeutic targets for personalized miRNA-based therapy, which could be an alternative adjuvant therapy, although further studies are needed in this area of research before such therapies can be translated to the clinic.
In addition to treatment response, the ability to predict survival accurately may be crucial for initial treatment planning in patients with DLBCL. We identified seven miRNAs (miR-4444, miR-449c-5p, miR-3681-5p, miR-3928-3p, miR-449b-5p, miR-370-3p, miR-4424) significantly upregulated in patients with long-term remission and four miRNAs (miR-133a-3p, miR-133a-3p, miR-208b-3p, miR-205-5p) upregulated in relapsed patients. In addition, the survival analysis reveals that high expression of miR-370-3p is significantly associated with PFS. However, the association was not maintained when IPI and subtype were included in the model. As mentioned above, its role as a mediator of DLBCL cell survival in vitro has been stablished in two different studies. Thus, further studies are warranted to confirm its role as outcome predictor.
To gain insight into the global mechanism of action of deregulated miRNAs, we also performed a miRNA-mRNA interaction network analysis. Integration analysis determined that up to 214 genes could have altered www.nature.com/scientificreports/ expression in DLBCL because of miRNA deregulation. In this line, there were 138 genes with low expression in DLBCL that could be due, at least in part, to the increased expression of miRNAs identified in the current study. Moreover, 76 genes were highly expressed that could be due to a lack of regulation by the miRNAs lowly expressed here identified. Among the most interesting downregulated genes targeted by more than one miRNA overexpressed in DLBCL, we identified FOXO1 and PTEN, involved in PI3K-Akt pathway 26 , predicted to be targeted by 5 (miR-182-5p, miR-183-5p, miR-135a-5p, miR-9-5p, and miR-9-3p) and 4 (miR-155, miR-320a, miR-205 and miR-182-5p) overexpressed miRNAs, respectively. FOXO1 is a tumor suppressor and its downregulation leads to promoting tumorigenesis by favoring resistance to stress, proliferation, and increased cellular survival 27 . Abnormal cells that would normally undergo apoptosis may instead survive in the absence of FOXO1, which results in tumor expansion 28 . In fact, downregulation of FOXO1 has been previously reported in DLBCLs 29 . In the same line, PTEN is a tumor suppressor that acts upstream of FOXO1 negatively regulating AKT. Without PTEN regulation, activated AKT phosphorylates FOXO1, which is subsequently exported from the nucleus into the cytoplasm and degraded by the proteasome 30 . Interestingly, PTEN is the major negative regulator of the PI3K/ AKT signaling pathway, which is constitutively activated in 25-50% of DLBCL according to recent studies 31 . Interestingly, miR-182-5p targeted both genes and could be a crucial miRNA modulating the PI3K/AKT pathway by targeting PTEN, upstream, and FOXO1, downstream. Further supporting our findings, a relation between miR-182 and miR-183 and FOXO1 suppression have been previously described in classical Hodgkin lymphoma 32 .
Regarding the upregulated genes in DLBCL that could be a consequence of miRNA downregulation, the most significant result is BCL2. Evidence revealed that elevated expression of anti-apoptotic members such as Bcl-2 is one of the major contributing factors to B cell lymphomagenesis 33 . Since then, many studies have determined that BCL2 is one of the most important oncogenes in cancer and that it is linked with lymphoma development, particularly when c-MYC is overexpressed 34 . This gene could be regulated negatively by miR-135a-5p, miR-234-5p, miR-139-5p, miR-451a, and miR-497-5p. Therefore, the low expression of the mentioned miRNAs observed in DLBCL would contribute to BCL2 overexpression. In fact, previous studies have confirmed that deletion or downregulation of miRNAs is a mechanism for BCL2 overexpression 35 .
Finally, this study has some limitations that need to be addressed. First, we analyzed miRNA expression from formalin-fixed paraffin-embedded (FFPE) blocks, in which RNA quality is usually compromised. However, compared to mRNA, miRNAs are relatively resistant to RNase degradation 36 , and previous studies demonstrated that miRNA expression from FFPE samples is in good correlation with fresh frozen samples 37 . Secondly, patients were diagnosed in three different hospitals with different follow up routines, which could be a source of heterogeneity, although the treatment protocols were similar. Finally, we also have to mention that in this study the Han's IHC algorithm was used to classify the patients into GCB and Non-GCB subtypes. Even though studies have shown clearly that the IHC-based classification provided very similar outcome prediction compared with the gene expression profiling based classification, which is the gold standard 38 , variable reproducibility of IHC stains and interpretations, may be the reason for inconsistent results 39 .
In conclusion, in the present study, a comprehensive analysis of miRNA profiles by next-generation sequencing technology allowed us to identify signatures that could be of relevance in DLBCL. Prospective studies are warranted to further validate our findings and explore new therapeutic options based on the miRNA profile of each patient. Moreover, we explored the mechanisms by which deregulated miRNAs contributes to DLBCL pathogenesis, with miR-182-5p playing a key role in PI3K/AKT pathway.

Materials and methods
Population of study. Samples from 78 DLBCL patients were obtained at diagnosis from formalin-fixed paraffin-embedded (FFPE) tumor tissue (50 patients with long-term complete remission, 12 patients with refractory disease, and 16 patients that relapsed within 10 years after diagnosis). Samples were collected from 1999 to 2018 at the Hematology Units of 3 Spanish reference hospitals (Cruces University Hospital, Donostia University Hospital, and Araba University Hospital). Control samples were obtained from non-tumoral ganglia from individuals without DLBCL, which were collected at Araba University Hospital. Demographic and clinical data were obtained from patient's medical files by two independent clinical researchers ( Table 5). Demographic information was also obtained for the controls. All patients were treated with R-CHOP or similar chemotherapy regimens, which included rituximab. The study was approved by the Clinical Research Ethical Committee of the Basque Country (P2016121). Signed informed consent was obtained from each participant and the study was carried out according to the Declaration of Helsinki.
Sample processing, Small RNA-seq library preparation and sequencing. RNA was isolated from FFPE samples with the miRNeasy FFPE Kit (Qiagen, Hilden, Germany). One µg of total RNA was used for library preparation with TruSeq small RNA Sample Prep Kit (Illumina), according to the manufacturer's protocol. Libraries were analyzed using Agilent DNA High Sensitivity chip and sequenced Single Read, 50nts (v4) on Illumina's HiSeq 2500 with a depth of approximately 10 million reads per sample. All the process was carried out at the Centre for Genomic Regulation (CRG). The data are openly available in GEO database, reference number GSE185796 40 . Bioinformatic analysis and differential miRNA expression. Reads were trimmed for the presence of the small RNA adapter using Skewer and filtered removing every sequence shorter than 15 and longer than 40 bases. The remaining ones were then aligned to the reference genome (GRCh38) using ShortStack and the resulting alignments used by htseq-count for determining the number of tags per gene. We used the annotation of the miRBase Sequence Database v22.1mirbase in order to select only miRNA genes. Every small RNA with less than 10 reads considering the sum of the reads in every condition was excluded, and we used the remain- www.nature.com/scientificreports/ ing genes for differential expression analysis with DESeq2. p-values were adjusted using False discovery rate (p-adj). Differentially expressed miRNAs were those with p-adj < 0.05 and a log 2 fold change (logFC) > 2 or < − 2. Survival analysis. Survival analyses were conducted to confirm the absence of effect of changes in chemotherapy (standard vs. intensified) on survival. Additional survival analyses were performed to estimate the effect on the progression-free survival (PFS) and overall survival (OS) of those miRNAs differentially expressed between samples at diagnosis from patients with long-term remission and those that relapsed. The median value of the expression of those differentially expressed miRNAs was considered to categorize patients into low or high expression groups. Kaplan-Meier analysis was used to define the survival curves and the log-rank test was used to assess significance. For the multivariate analysis, we also considered the subtype and IPI status using the Cox proportional hazards (Cox PH) method. All calculations were performed using the Survival R package. Statistically significant associations with survival were those with p-value < 0.05. miRNA-mRNA interaction network. Using our own data on miRNA expression (DLBCL patients vs controls) and mRNA expression data contained in the Gene Expression Omnibus (GEO) database, we constructed a miRNA-mRNA interaction network to elucidate the mechanism of DLBCL development. First, for those miRNAs deregulated in patients compared to controls, target genes were identified using mirTarbase 41 (http:// mirta rbase. cuhk. edu. cn/ php/ index. php) to search for experimentally validated miRNA-mRNA interactions. For the current study, only strong miRNA-mRNA interactions were selected. Secondly, after a comprehensive search of databases in GEO (http:// www. ncbi. nlm. nih. gov/ geo/), the only microarray dataset (GSE56315) containing gene expression information of both DLBCL samples (n = 55) and healthy controls (n = 33) was downloaded 42,43 . Differentially expressed genes (DEG) were identified using GEO2R tool with default parameters. We established the following inclusion criteria for the DEGs: upregulated genes must have a logFC ≥ 2 and p-adj < 0.05, while downregulated genes must have a logFC ≤ − 2 and p-adj < 0.05. Finally, experimentally validated targets of differentially deregulated miRNAs were overlapped with mRNAs differentially expressed obtained from GEO databases as represented in Supplementary Fig. S2. The intersection of the upregulated and downregulated genes was mapped using the Venn package (http:// bioin forma tics. psb. ugent. be/ webto ols/ Venn/). Signaling pathway enrichment analysis (Consensus Pathway database http:// cpdb. molgen. mpg. de/) were conducted with overlapping genes. www.nature.com/scientificreports/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.